Measuring the Agreement Among Relevance Judges

Author

  • S. Mizzaro
Abstract

The agreement (or disagreement) between relevance judges is becoming a more important issue for two reasons: new kinds of relevance judgment expression are being used (to the classical dichotomous one, various researchers have added scalar, weighted, and orders of various kinds), and new media are being introduced (it is far quicker to judge the relevance of an image than of a text, so human judgments can be obtained more easily). This paper presents a coherent account of the disagreement between relevance judges and groups of judges. Judgment expressions of different kinds, grouped into two categories, are taken into account. The first category, score judgments, comprises the more classical dichotomous, scalar, and weighted judgments. The second, order judgments, comprises total (or linear) and partial (or weak) orders, both with and without equality. A uniform notation for describing relevance judgments of each kind is proposed; some of the problems that arise when one tries to measure the disagreement between judges operationally are described; a measure for the disagreement of two judges expressing two judgments of the same kind is proposed; the disagreement of a group of more than two judges is discussed; and, finally, some experimental activity inspired by this study is sketched.
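As a rough illustration of the two categories named in the abstract, the sketch below computes one simple disagreement value for score judgments and one for order judgments. It is not the measure proposed in the paper, which the abstract does not specify: the function names, the mean-absolute-difference formula, and the use of a normalized Kendall tau distance are all assumptions chosen for illustration.

```python
from itertools import combinations


def score_disagreement(scores_a, scores_b, max_score=1.0):
    """Mean absolute difference between two judges' scores, scaled to [0, 1].

    Assumption: both judges scored the same documents in the same order,
    on a scale from 0 to max_score (dichotomous judgments use 0 and 1).
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("both judges must score the same non-empty document list")
    total = sum(abs(x - y) for x, y in zip(scores_a, scores_b))
    return total / (len(scores_a) * max_score)


def order_disagreement(rank_a, rank_b):
    """Fraction of document pairs that the two judges order in opposite ways.

    Assumption: each argument maps a document id to its rank position.
    Equal ranks (ties) are allowed; a pair tied by either judge is not
    counted as discordant.
    """
    if set(rank_a) != set(rank_b):
        raise ValueError("both judges must rank the same documents")
    pairs = list(combinations(sorted(rank_a), 2))
    if not pairs:
        return 0.0
    discordant = sum(
        1
        for d1, d2 in pairs
        if (rank_a[d1] - rank_a[d2]) * (rank_b[d1] - rank_b[d2]) < 0
    )
    return discordant / len(pairs)


if __name__ == "__main__":
    # Hypothetical dichotomous judgments of four documents by two judges.
    print(score_disagreement([1, 0, 1, 1], [1, 1, 0, 1]))
    # Hypothetical total orders over three documents.
    print(order_disagreement({"d1": 1, "d2": 2, "d3": 3},
                             {"d1": 1, "d2": 3, "d3": 2}))
```

Both functions return 0 for identical judgments and 1 for maximally opposed ones under these assumptions, which makes values from the two categories loosely comparable; extending either to a group of more than two judges (for example by averaging over all pairs of judges) would be a further assumption.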

Similar resources

Measuring User Relevance Criteria

Recently, Scholer and Turpin [Proc. SIGIR 2008] proposed the use of techniques from the field of psychophysics to determine a relevance threshold for a user. Using this threshold, they observed, one could match the relevance criteria of users to those of judges used to develop a test collection, hence selected users should have a better search experience with systems judged superior on that col...

Modelling Disagreement Between Judges for Information Retrieval System Evaluation

The batch evaluation of information retrieval systems typically makes use of a testbed consisting of a collection of documents, a set of queries, and for each query, a set of judgements indicating which documents are relevant. This paper presents a probabilistic model for predicting IR system rankings in a batch experiment when using document relevance assessments from different judges, using t...

Exploring fact-focused relevance and novelty detection

Purpose – Automated sentence-level relevance and novelty detection would be of direct benefit to many information retrieval systems. However, the low level of agreement between human judges performing the task is an issue of concern. In previous approaches, annotators were asked to identify sentences in a document set that are relevant to a given topic, and then to eliminate sentences that do n...

Analyses of Wine-Tasting Data: A Tutorial*

The purpose of this paper is to provide a tutorial of data analysis methods for answering questions that arise in analyzing data from wine-tasting events: (i) measuring agreement of two judges and its extension to m judges; (ii) making comparisons of judges across years; (iii) comparing two wines; (iv) designing tasting procedures to reduce burden of multiple tastings; (v) ranking of judges; and...

Summarizing Scientific Articles: Experiments with Relevance and Rhetorical Status

In this article we propose a strategy for the summarization of scientific articles that concentrates on the rhetorical status of statements in an article: Material for summaries is selected in such a way that summaries can highlight the new contribution of the source article and situate it with respect to earlier work. We provide a gold standard for summaries of this kind consisting of a substa...

Publication date: 1999